Mock exam 2018 - solutions

Question 1

(a) Definition of a normal form game

An $N$ player normal form game consists of:

  • A finite set of $N$ players
  • Strategy spaces for the players: $\{S_1,S_2,S_3,\dots,S_N\}$;
  • Payoff functions for the players: $u_i:S_1\times S_2\dots\times S_N\to\mathbb{R}$

[2]

(b) Identifying best responses

We have:

  • $u_r((1, 0), (y, 1-y))= y - 2 (1 - y)=3y-2$ and $u_r((0, 1), (y, 1-y))= -2y + (1 - y)=1-3y$
  • $u_c((x, 1 - x), (1, 0))=-2x+(1-x)=1-3x$ and $u_c((x, 1- x), (0, 1))=x-2(1-x)=3x-2$

In [1]:
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
plt.figure()
ys = [0, 1]
plt.plot(ys, [3 * y - 2 for y in ys], label="$u_r((1, 0), (y, 1-y))$")
plt.plot(ys, [1 - 3 * y for y in ys], label="$u_r((0, 1), (y, 1-y))$")
plt.legend()
plt.xlabel("$y$")
plt.title("Utility to row player");


[1]


In [2]:
plt.figure()
xs = [0, 1]
plt.plot(xs, [1 - x * 3 for x in xs], label="$u_c((x, 1 - x), (1, 0))$")
plt.plot(xs, [3 * x - 2 for x in xs], label="$u_c((x, 1 - x), (0, 1))$")
plt.legend()
plt.xlabel("$x$")
plt.title("Utility to column player");


[1]

The best responses are then given by:

$$ \sigma_r^* = \begin{cases} (0,1)&\text{ if }y< 1/2\\ (1,0)&\text{ if }y> 1/2\\ \text{indifferent}&\text{ otherwise} \end{cases} $$

$$ \sigma_c^* = \begin{cases} (1,0)&\text{ if }x< 1/2\\ (0,1)&\text{ if }x> 1/2\\ \text{indifferent}&\text{ otherwise} \end{cases} $$

[1]
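These best response regions can be checked numerically. A small sketch, assuming the row player's payoff matrix is reconstructed from the utilities in (b) (the exam paper itself states the game):

```python
import numpy as np

# Payoff matrix reconstructed from the utilities in (b) (an assumption;
# the exam paper itself states the game)
A = np.array([[1, -2], [-2, 1]])

def row_best_responses(y):
    """Indices of the row player's pure best responses to (y, 1 - y)."""
    utilities = A @ np.array([y, 1 - y])
    return list(np.argwhere(utilities == utilities.max()).flatten())

print(row_best_responses(0.25))  # [1]: play (0, 1) when y < 1/2
print(row_best_responses(0.75))  # [0]: play (1, 0) when y > 1/2
print(row_best_responses(0.5))   # [0, 1]: indifferent at y = 1/2
```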

(c) Proof of best response condition

$(A\sigma_c^T)_i$ is the utility of the row player when they play their $i$th strategy. Thus:

$$\sigma_rA\sigma_c^T=\sum_{i=1}^{m}{\sigma_r}_i(A\sigma_c^T)_i$$

[2]

Let $u=\max_{k}(A\sigma_c^T)_k$. Thus:

$$ \begin{aligned} \sigma_rA\sigma_c^T&=\sum_{i=1}^{m}{\sigma_r}_i(u - u + (A\sigma_c^T)_i)\\ &=\sum_{i=1}^{m}{\sigma_r}_iu - \sum_{i=1}^{m}{\sigma_r}_i(u - (A\sigma_c^T)_i)\\ &=u - \sum_{i=1}^{m}{\sigma_r}_i(u - (A\sigma_c^T)_i) \end{aligned}$$

[2]

We know that $u - (A\sigma_c^T)_i\geq 0$, thus the largest $\sigma_rA\sigma_c^T$ can be is $u$ which occurs iff ${\sigma_r}_i > 0 \Rightarrow (A\sigma_c^T)_i = u$ as required.

[1]
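A quick numerical illustration of this condition, again assuming the matrix reconstructed from the utilities in (b): a strategy attains the maximum payoff $u$ if and only if it only puts weight on maximising rows.

```python
import numpy as np

# Matrix reconstructed from the utilities in (b) (an assumption)
A = np.array([[1, -2], [-2, 1]])
sigma_c = np.array([3 / 4, 1 / 4])
u = (A @ sigma_c).max()  # the best achievable payoff against sigma_c

# A strategy supported only on maximising rows attains u ...
print(np.array([1, 0]) @ A @ sigma_c)
# ... whereas any weight on a non-maximising row strictly lowers the payoff
print(np.array([1 / 2, 1 / 2]) @ A @ sigma_c)
```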

(d) Finding the Nash equilibrium

There are no pairs of pure strategies that are best responses to each other (visible from the payoff matrices). We consider mixed strategies $\sigma_r=(x, 1- x)$ and $\sigma_c=(y, 1-y)$ [1]

For $\sigma_r$ to be a best response to $\sigma_c$ (and similarly for $\sigma_c$), from the theorem we have:

$$u_r((1, 0), \sigma_c)=u_r((0, 1), \sigma_c)\Rightarrow 3y-2=1-3y\Rightarrow y=1/2$$

$$u_c(\sigma_r, (1, 0))=u_c(\sigma_r, (0, 1))\Rightarrow 1-3x=3x-2\Rightarrow x=1/2$$

[2]

The Nash equilibrium is $((1/2, 1/2), (1/2, 1/2))$.

This is confirmed by the best response findings of question (b): these two strategies are best responses to each other. [1]
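The indifference equations can also be solved numerically. A minimal sketch, with the matrices reconstructed from the utilities in (b) (an assumption):

```python
import numpy as np

# Matrices reconstructed from the utilities in (b) (an assumption)
A = np.array([[1, -2], [-2, 1]])
B = np.array([[-2, 1], [1, -2]])

# Row player indifferent between rows: A[0] @ (y, 1-y) = A[1] @ (y, 1-y)
y = (A[1, 1] - A[0, 1]) / (A[0, 0] - A[0, 1] - A[1, 0] + A[1, 1])
# Column player indifferent between columns, using B similarly
x = (B[1, 1] - B[1, 0]) / (B[0, 0] - B[0, 1] - B[1, 0] + B[1, 1])
print(x, y)  # 0.5 0.5
```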

(e) Discussing the Knight et al. 2017 paper

  • (i): This paper builds a Markov queueing model based on strategic interactions between two hospitals [1]. It measures the Price of Anarchy to see the effect on patients of rational behaviour [1]. It does all this by comparing the Nash equilibrium to the optimal behaviour [1].
  • (ii): The main theoretical result concerns the behaviour of the best response functions. [1] The paper proves that they are increasing functions, which ensures there is a single point of intersection. [1]
  • (iii): Various possibilities each worth [2]:
    • A stationary queueing model is used, whilst in practice a time dependent model would be more appropriate.
    • Only considering interaction between two hospitals.
    • Just a single cutoff strategy.
  • (iv): Depending on the choice of limiting factor [3]:
    • Use a non stationary queueing model, potentially simulating the queueing process in a time dependent way.
    • Consider a multi player game, although this would make the theoretical result harder to generalise.
    • Use multiple cutoff strategies (this would be the simplest approach).

Question 2

(a) Definition of repeated game

Given a two player game $(A,B)\in{\mathbb{R}^{m\times n}}^2$, referred to as a stage game, a $T$-stage repeated game is a game in which players play that stage game for $T > 0$ periods. Players make decisions based on the full history of play over all the periods.

[2]

(b) Definition of a strategy in a repeated game

Given a two player stage game $(A,B)\in{\mathbb{R}^{m\times n}}^2$ repeated to give a $T$-stage repeated game, a strategy is a mapping from the entire history of play to an action of the stage game:

$$ \bigcup_{t=0}^{T-1}H(t)\to a $$

where:

  • $H(t)$ is the history of play of both players up until stage $t$ ($H(0)=(\emptyset, \emptyset)$)
  • $a$ is an action (for either player) of the stage game

[2]

(c) Size of history space

For a given integer value of $0\leq t< T$ there are $|S_1|^t$ possible histories for player 1 and $|S_2|^t$ possible histories for player 2. Thus the total number of histories is: [2]

$$\sum_{t=0}^{T - 1}|S_1|^t|S_2|^t$$

[1]

which gives:

$$\sum_{t=0}^{T - 1}|S_1|^t|S_2|^t=\sum_{t=0}^{T - 1}(|S_1||S_2|)^t=\frac{1 - (|S_1||S_2|)^T}{1 - |S_1||S_2|}$$

[2]
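The closed form can be checked by brute force enumeration. A small sketch (`number_of_histories` is a hypothetical helper, not from the question):

```python
import itertools

def number_of_histories(s1, s2, T):
    """Brute force count of all histories (pairs of action
    sequences) of length t for each 0 <= t < T."""
    return sum(
        len(list(itertools.product(
            itertools.product(range(s1), repeat=t),
            itertools.product(range(s2), repeat=t),
        )))
        for t in range(T)
    )

# Matches the geometric sum (1 - (s1 * s2) ** T) / (1 - s1 * s2)
s1, s2, T = 2, 3, 2
closed_form = (1 - (s1 * s2) ** T) / (1 - s1 * s2)
print(number_of_histories(s1, s2, T), closed_form)  # 7 7.0
```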

(d) Listing strategy spaces

$S_1=\{r_1, r_2\}$, $S_2=\{c_1, c_2, c_3\}$ and $T=2$

$$\{(\emptyset, \emptyset), (r_1, c_1), (r_1, c_2), (r_1, c_3), (r_2, c_1), (r_2, c_2), (r_2, c_3)\}$$

[3]
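This set can be generated programmatically. A sketch (representing the empty history as a pair of empty tuples is an assumption of the sketch):

```python
import itertools

# History space for S1 = {r1, r2}, S2 = {c1, c2, c3} and T = 2
S1, S2 = ("r1", "r2"), ("c1", "c2", "c3")
histories = [((), ())]  # the empty history at t = 0
histories += [((r,), (c,)) for r, c in itertools.product(S1, S2)]
for history in histories:
    print(history)
print(len(histories))  # 7 histories, matching the count from (c)
```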

(e) Sequence of stage Nash theorem

Consider the following strategy:

The row/column player should play action $a_{r/c}$ regardless of the play of any previous strategy profiles. [2]

where $(a_{r}, a_{c})$ is a given stage Nash equilibrium.

Using backwards induction: the profile played at the last stage is a Nash equilibrium of the stage game, so no player has an incentive to deviate there. Given that, the same holds at the $(T-1)$th stage, and so on back to the first stage. The proof follows.

[2]

(f) Pure Nash equilibria

The pure Nash equilibria:

$$A = \begin{pmatrix}\underline{3} & \underline{6} & \underline{1}\\1 & 2 & 0\\\end{pmatrix} \qquad B = \begin{pmatrix}0 & \underline{7} & \underline{7}\\\underline{20} & 1 & 0\\\end{pmatrix}$$

Thus, for our example we have the four Nash equilibria:

  • $(r_1r_1, c_2c_2)$ with utility: (12, 14). [1]
  • $(r_1r_1, c_2c_3)$ with utility: (7, 14). [1]
  • $(r_1r_1, c_3c_2)$ with utility: (7, 14). [1]
  • $(r_1r_1, c_3c_3)$ with utility: (2, 14). [1]
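These utilities can be checked directly from the payoff matrices. A small sketch (indexing the actions as $r_1=0$, $c_2=1$, $c_3=2$ is an assumption of the sketch):

```python
import numpy as np

A = np.array([[3, 6, 1], [1, 2, 0]])
B = np.array([[0, 7, 7], [20, 1, 0]])

def repeated_utility(row_plays, col_plays):
    """Sum the stage payoffs along a sequence of action profiles."""
    payoffs = [(A[r, c], B[r, c]) for r, c in zip(row_plays, col_plays)]
    return tuple(int(sum(u)) for u in zip(*payoffs))

# The four sequences of stage Nash equilibria (r1 = 0, c2 = 1, c3 = 2)
for cols in [(1, 1), (1, 2), (2, 1), (2, 2)]:
    print(repeated_utility((0, 0), cols))
```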

(g) Non stage Nash equilibria

Consider the following two strategies:

  1. For the row player:

    $$(\emptyset, \emptyset) \to r_2$$ $$(r_2, c_1) \to r_1$$ $$(r_2, c_2) \to r_1$$ $$(r_2, c_3) \to r_1$$

    [2]

  2. For the column player:

    $$(\emptyset, \emptyset) \to c_1$$ $$(r_1, c_1) \to c_3$$ $$(r_2, c_1) \to c_2$$

    [2]

This is a Nash equilibrium because:

  1. If the row player deviates, it would only be rational to do so in the first stage; they would gain 2 in that stage but lose 5 in the second stage. Thus they have no incentive to deviate.
  2. If the column player deviates, they would only do so in the first stage and gain no utility.

[2]
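The deviation payoffs can be verified from the payoff matrices. A sketch (using the indexing $r_1=0, r_2=1$ and $c_1=0, c_2=1, c_3=2$, an assumption of the sketch):

```python
import numpy as np

A = np.array([[3, 6, 1], [1, 2, 0]])
B = np.array([[0, 7, 7], [20, 1, 0]])

# Payoffs along the equilibrium path: (r2, c1) then (r1, c2)
on_path = (int(A[1, 0] + A[0, 1]), int(B[1, 0] + B[0, 1]))

# If the row player deviates to r1 in the first stage, the column player's
# strategy responds with c3 in the second stage: (r1, c1) then (r1, c3)
row_deviates = (int(A[0, 0] + A[0, 2]), int(B[0, 0] + B[0, 2]))

print(on_path, row_deviates)  # (7, 27) (4, 7): deviating loses 3 overall
```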

Question 3

(a) Defining the row/col best response polytopes

For a two player game $(A, B)\in{\mathbb{R}^{m\times n}_{>0}}^2$ the row/column player best response polytope $\mathcal{P}$/$\mathcal{Q}$ is defined by:

[1]

$$ \mathcal{P} = \left\{x\in\mathbb{R}^{m}\;|\;x\geq 0; xB\leq 1\right\} $$

[1]

$$ \mathcal{Q} = \left\{y\in\mathbb{R}^{n}\;|\; Ay\leq 1; y\geq 0 \right\} $$

[1]

(b) Degeneracy of the game

The column player has two best responses to the first row thus the game is degenerate. [1]

(c) Obtain half space definitions:

Directly apply the definition: [1]

Note that any modification of $A, B$ by adding a constant (so as to make all entries $>0$) is acceptable: this does not change the equilibria.

For $\mathcal{P}$ :

$$ \begin{aligned} x_1 &\geq 0\\ x_2 &\geq 0\\ 20x_2 & \leq 1\\ 7 x_1 + x_2 & \leq 1\\ 7 x_1& \leq 1\\ \end{aligned} $$

[2]

For $\mathcal{Q}$ :

$$ \begin{aligned} 3y_1 +6y_2 + y_3 & \leq 1\\ y_1 +2y_2 & \leq 1\\ y_1 &\geq 0\\ y_2 &\geq 0\\ y_3 &\geq 0\\ \end{aligned} $$

[2]

(d) Confirming labelling:

There are infinitely many possible vertices (based on potential modifications of $A, B$). Thus it makes sense to consider the strategies that correspond to the normalised vertices given (recall the question does not ask us to find the vertices):

For $\mathcal{P}$:

  • $x=(0, 0)\text{ with labels: }\{0, 1\}$: $x_1=0$ and $x_2=0$
  • $x=(1/7, 0)\rightarrow \sigma_r=(1, 0)\text{ with labels: }\{1, 3, 4\}$. Label $1$ because $x_2=0$. Labels $\{3, 4\}$ because the 2nd and 3rd columns are best responses.
  • $x=(0, 1/20)\rightarrow \sigma_r=(0, 1)\text{ with labels: }\{0, 2\}$: Label $0$ because $x_1=0$. Label $2$ because the first column is a best response.
  • $x=(19/140, 1/20)\rightarrow \sigma_r=(19/26, 7/26)\text{ with labels: }\{2, 3\}$: we have $\sigma_r B = (70/13, 70/13, 133/26)$ thus columns 1 and 2 are best responses.

[2]

For $\mathcal{Q}$:

  • $y=(0, 0, 0)\text{ with labels: }\{2, 3, 4\}$: $y_1=0$, $y_2=0$ and $y_3=0$
  • $y=(0, 0, 1)\rightarrow \sigma_c=(0, 0, 1)\text{ with labels: }\{0, 2, 3\}$: labels 2, 3 are immediate; label 0 because the first row is a best response to the 3rd column.
  • $y=(0, 1/6, 0)\rightarrow \sigma_c=(0, 1, 0)\text{ with labels: }\{0, 2, 4\}$: labels 2, 4 are immediate; label 0 because the first row is a best response to the 2nd column.
  • $y=(1/3, 0, 0)\rightarrow \sigma_c=(1, 0, 0)\text{ with labels: }\{0, 3, 4\}$: labels 3, 4 are immediate; label 0 because the first row is a best response to the 1st column.

[2]
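The labels can also be verified numerically. A sketch for $\mathcal{P}$ (the label convention follows the solution above: 0, 1 for the row player's strategies, 2, 3, 4 for the columns; `labels` is a hypothetical helper):

```python
import numpy as np

B = np.array([[0, 7, 7], [20, 1, 0]])

def labels(x):
    """Tight constraints of P at x: labels 0, 1 for x_i = 0 and
    labels 2, 3, 4 for binding columns of xB <= 1."""
    tight_x = {i for i in range(2) if np.isclose(x[i], 0)}
    tight_B = {2 + j for j in range(3) if np.isclose((x @ B)[j], 1)}
    return tight_x | tight_B

print(labels(np.array([1 / 7, 0])))          # {1, 3, 4}
print(labels(np.array([0, 1 / 20])))         # {0, 2}
print(labels(np.array([19 / 140, 1 / 20])))  # {2, 3}
```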


In [3]:
import nashpy as nash


A = np.array([[3, 6, 1], 
              [1, 2, 0]])
B = np.array([[0, 7, 7], 
              [20, 1, 0]])

row_halfspaces = nash.polytope.build_halfspaces(B.transpose())
for vertex, labels in nash.polytope.non_trivial_vertices(row_halfspaces):
    print(np.round(vertex / sum(vertex), 4))


[1. 0.]
[0. 1.]
[0.7308 0.2692]

In [4]:
col_halfspaces = nash.polytope.build_halfspaces(A)
for vertex, labels in nash.polytope.non_trivial_vertices(col_halfspaces):
    print(np.round(vertex / sum(vertex), 4))


[0. 0. 1.]
[0. 1. 0.]
[1. 0. 0.]

(e) The vertex enumeration algorithm

For a nondegenerate 2 player game $(A, B)\in{\mathbb{R}^{m\times n}_{>0}}^2$ the following algorithm returns all Nash equilibria:

  1. For every pair of vertices of the best response polytopes:
  2. check whether the pair of vertices is fully labelled;
  3. if so, return the normalised probability vectors as a Nash equilibrium.

[2]

(f) Use the vertex enumeration algorithm

The only vertex pairs with a full set of labels:

  • $x=(1/7, 0)\text{ with labels: }\{1, 3, 4\}$ and $y=(0, 0, 1)\text{ with labels: }\{0, 2, 3\}$ [2]
  • $x=(1/7, 0)\text{ with labels: }\{1, 3, 4\}$ and $y=(0, 1/6, 0)\text{ with labels: }\{0, 2, 4\}$ [2]

This corresponds to:

$$\{((1, 0), (0, 0, 1)), ((1, 0), (0, 1, 0))\}$$

[1]


In [5]:
game = nash.Game(A, B)
for eq in game.vertex_enumeration():
    print([np.round(s, 2) for s in eq])


[array([1., 0.]), array([0., 0., 1.])]
[array([1., 0.]), array([0., 1., 0.])]

(g) The L-H Algorithm

For a nondegenerate 2 player game $(A, B)\in{\mathbb{R}^{m\times n}_{>0}}^2$ the following algorithm returns a Nash equilibrium:

  1. Start at the artificial equilibrium: $(0, 0)$
  2. Choose a label to drop.
  3. Remove this label from the corresponding vertex by traversing an edge of the corresponding polytope to another vertex.
  4. The new vertex will now have a duplicate label in the other polytope. Remove this label from the vertex of the other polytope and traverse an edge of that polytope to another vertex.
  5. Repeat step 4 until the pair of vertices is fully labelled.

(h) Using the L-H Algorithm

Note that the game is degenerate but we can still attempt to use the algorithm:

  • $((0, 0), (0, 0, 0))$ have labels: $\{0, 1\}, \{2, 3, 4\}$. Drop 0 (arbitrary decision) in $\mathcal{P}$. [1]
  • $\to ((1/7, 0), (0, 0, 0))$ have labels: $\{1, 3, 4\}, \{2, 3, 4\}$. In $\mathcal{Q}$ drop 3 (arbitrary decision, could have dropped 4). [1]
  • $\to ((1/7, 0), (0, 1/6, 0))$ have labels: $\{1, 3, 4\}, \{0, 2, 4\}$. [1]

We have a fully labelled vertex pair (one of the equilibria found in part (f)).

There are numerous possible runs of this algorithm depending on the choice of dropped labels; here is another:

  • $((0, 0), (0, 0, 0))$ have labels: $\{0, 1\}, \{2, 3, 4\}$. Drop 2 (arbitrary decision) in $\mathcal{Q}$.
  • $\to ((0, 0), (1/3, 0, 0))$ have labels: $\{0, 1\}, \{0, 3, 4\}$. In $\mathcal{P}$ drop 0.
  • $\to ((1/7, 0), (1/3, 0, 0))$ have labels: $\{1, 3, 4\}, \{0, 3, 4\}$. In $\mathcal{Q}$ drop 4 (arbitrary decision, could have dropped 3).
  • $\to ((1/7, 0), (0, 0, 1))$ have labels $\{1, 3, 4\}, \{0, 2, 3\}$

Again we have a fully labelled vertex pair (the other equilibrium found in part (f)).

Question 4

(a) Definition of a Moran process on a game

Consider a matrix $A\in\mathbb{R}^{2\times 2}$ representing a game with two strategies:

$$ A= \begin{pmatrix} a & b\\ c & d \end{pmatrix} $$

The Moran process is as follows:

  • At a given time step: all individuals play all other individuals.
  • Obtain their fitness as given by the game.
  • Randomly select an individual, proportional to their fitness, to be reproduced.
  • Uniformly select an individual to be replaced.
  • Proceed to the next time step.
  • The process terminates when there is only one type of individual in the population.

[4]
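The steps above can be simulated directly. A minimal sketch for a 2x2 game (`simulate_moran` is a hypothetical helper; payoffs are assumed nonnegative so that the selection probabilities are well defined):

```python
import numpy as np

def simulate_moran(game, N, i, repetitions, seed=0):
    """Estimate the fixation probability of the first type, starting
    with i individuals of that type in a population of size N."""
    rng = np.random.default_rng(seed)
    fixations = 0
    for _ in range(repetitions):
        k = i  # current number of individuals of the first type
        while 0 < k < N:
            # Mean payoff of each type against the rest of the population
            f1 = (game[0, 0] * (k - 1) + game[0, 1] * (N - k)) / (N - 1)
            f2 = (game[1, 0] * k + game[1, 1] * (N - k - 1)) / (N - 1)
            # Birth proportional to fitness, death uniformly at random
            born_first = rng.random() < k * f1 / (k * f1 + (N - k) * f2)
            dies_first = rng.random() < k / N
            k += int(born_first) - int(dies_first)
        fixations += k == N
    return fixations / repetitions

# The game of part (c) with r = 2: the estimate should be close to x_1 = 2/5
print(simulate_moran(np.array([[0, 2], [2, 0]]), N=3, i=1, repetitions=5000))
```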

(b) Theorem

Given a birth death process, the fixation probability $x_i$ is given by:

$$x_i=\frac{1+\sum_{j=1}^{i-1}\prod_{k=1}^j\gamma_k}{1+\sum_{j=1}^{N-1}\prod_{k=1}^j\gamma_k}$$

where:

$$ \gamma_k = \frac{p_{k,k-1}}{p_{k,k+1}} $$

[2]

The Moran process is a birth death process where the transition probabilities are then given by:

$$p_{i,i+1}=\frac{if_{1i}}{if_{1i} + (N-i)f_{2i}}\frac{N-i}{N}$$

$$p_{i,i-1}=\frac{(N-i)f_{2i}}{if_{1i} + (N-i)f_{2i}}\frac{i}{N}$$

[2]

which gives:

$$\gamma_i=\frac{f_{2i}}{f_{1i}}$$

[1]

thus (using the general birth death process result):

$$ x_i=\frac{1+\sum_{j=1}^{i-1}\prod_{k=1}^j\frac{f_{2k}}{f_{1k}}}{1+\sum_{j=1}^{N-1}\prod_{k=1}^j\frac{f_{2k}}{f_{1k}}} $$

[1]

(c) Moran process for the game

Assuming $i$ individuals of the first type, for this game we have $N=3$ and $(a, b, c, d)=(0, 2, r, 0)$; the fitness of the two types is given respectively by:

$$f_{1i}=\frac{a(i-1)+b(N-i)}{N-1}=\frac{6-2i}{2}=3-i$$

$$f_{2i}=\frac{ci+d(N-i-1)}{N-1}=\frac{ri}{2}$$

which gives:

$$\gamma_i=\frac{f_{2i}}{f_{1i}}=\frac{ri}{6-2i}$$

thus:

$$ x_1=\frac{1}{1+\sum_{j=1}^{2}\prod_{k=1}^j\frac{rk}{6-2k}}=\frac{1}{1 + \frac{r}{4} + \frac{r}{4}\frac{2r}{2}}=\frac{1}{\frac{r^{2}}{4} + \frac{r}{4} + 1} $$

for $r=2$ we get:

$$ x_1 = 2 / 5 $$

Some code to verify the result:


In [6]:
def theoretic_fixation(N, game, i=1):
    """
    Calculate x_i as given by the above formula
    """
    f_ones = np.array([(game[0, 0] * (k - 1) + game[0, 1] * (N - k)) / (N - 1) for k in range(1, N)])
    f_twos = np.array([(game[1, 0] * k + game[1, 1] * (N - k - 1)) / (N - 1) for k in range(1, N)])
    gammas = f_twos / f_ones
    return (1 + np.sum(np.cumprod(gammas[:i - 1]))) / (1 + np.sum(np.cumprod(gammas)))

In [7]:
import sympy as sym
import numpy as np
r = sym.symbols("r")
game = np.array([[sym.S(0), sym.S(2)], [sym.S(r), sym.S(0)]])
x_1 = theoretic_fixation(N=3, game=game)
x_1, x_1.subs({r: 2})


Out[7]:
(1/(r**2/4 + r/4 + 1), 2/5)

(d) Finding an $r$

Finding $r$ such that $x_1>9/10$ corresponds to:

$$\frac{1}{\frac{r^{2}}{4} + \frac{r}{4} + 1}>9/10$$

which corresponds to:

$$ \begin{aligned} \left(\frac{r^{2}}{4} + \frac{r}{4} + 1\right)9/10 & < 1 &&\text{[1]}\\ \left(\frac{r^{2}}{4} + \frac{r}{4} + 1\right) & < 10/9 && \text{[1]}\\ \frac{r^{2}}{4} + \frac{r}{4} - \frac{1}{9}& < 0 && \text{[1]} \\ r ^ 2 + r - \frac{4}{9}& < 0 && \text{[1]} \\ \end{aligned}$$

The roots of this polynomial are $-4/3$ and $1/3$ [2], thus (since $r>0$): $r<1/3$ ensures $x_1>9/10$. [1]

To ensure a high probability of fixation we need the fitness of an individual of the second type encountering an individual of the first type to be at most $1/3$. [1]


In [8]:
sym.solveset(r ** 2 + r  - sym.S(4) / 9, r)


Out[8]:
{-4/3, 1/3}